
# HETA: Hessian-Enhanced Token Attribution

This repository provides Python modules for reproducing the **Hessian-Enhanced Token Attribution (HETA)** method described in the HETA paper.
It includes utilities for loading datasets and models, computing attributions, and evaluating them.

---

## Files

1. **HETA_dataset.py**
   - Loads datasets used in the HETA paper:
     - Long-Range Agreement (LongRA)
     - TellMeWhy
     - WikiBio
   - Functions:
     - `load_longra()`: Load LongRA dataset
     - `load_tellmewhy()`: Load TellMeWhy dataset
     - `load_wikibio()`: Load WikiBio dataset
     - `load_all()`: Load all three datasets

2. **HETA_models.py**
   - Loads pretrained models used in the HETA experiments:
     - GPT-2
     - GPT-Neo 1.3B
     - GPT-J 6B
   - Functions:
     - `load_gpt2()`: Load GPT-2 model and tokenizer
     - `load_gptneo_1_3b()`: Load GPT-Neo 1.3B model and tokenizer
     - `load_gptj_6b()`: Load GPT-J 6B model and tokenizer
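     - `load_all()`: Load all three models (this shares its name with `load_all()` in `HETA_dataset.py`, so alias one of them when importing both into the same script)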

3. **HETA_evaluation.py**
   - Implements evaluation metrics from the paper:
     - Sensitivity
     - Active/Passive Robustness
     - F1 Alignment (human vs model)
     - AOPC (Area Over Perturbation Curve)
   - Functions:
     - `sensitivity()`
     - `active_passive_robustness()`
     - `f1_alignment()`
     - `aopc()`

4. **HETA_method.py**
   - Main implementation of the HETA attribution method.
   - Function:
     - `heta_attribution(model, tokenizer, sentence, alpha=1.0, beta=1.0, device="cpu")`
       - Computes token-level attribution for any input sentence.
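
For readers unfamiliar with AOPC, the sketch below shows one common formulation: mask the highest-attributed tokens one at a time and average the resulting drop in the model's score. It is an illustration only and is not taken from `HETA_evaluation.py`; the function name `aopc_sketch`, the `predict` callable, and the `mask_token` default are assumptions made for this example, and the repository's `aopc()` may differ in signature and normalization.

```python
from typing import Callable, List, Optional, Sequence


def aopc_sketch(
    tokens: Sequence[str],
    scores: Sequence[float],
    predict: Callable[[List[str]], float],
    max_k: Optional[int] = None,
    mask_token: str = "[MASK]",  # hypothetical placeholder token
) -> float:
    """Average score drop after cumulatively masking the top-k tokens, k = 1..K."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    n = len(tokens)
    k_max = n if max_k is None else min(max_k, n)
    if k_max == 0:
        return 0.0

    # Token positions ordered from most to least important.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)

    baseline = predict(list(tokens))
    perturbed = list(tokens)
    total_drop = 0.0
    for k in range(k_max):
        # Mask one more token, keeping the earlier masks in place.
        perturbed[order[k]] = mask_token
        total_drop += baseline - predict(perturbed)
    return total_drop / k_max


# Toy scorer (an assumption for the demo): fraction of tokens equal to "good".
def toy_predict(toks: List[str]) -> float:
    return toks.count("good") / len(toks)


faithful = aopc_sketch(["a", "good", "movie"], [0.1, 0.8, 0.1], toy_predict)
unfaithful = aopc_sketch(["a", "good", "movie"], [0.8, 0.1, 0.1], toy_predict)
print(faithful > unfaithful)  # attributions that rank "good" first score higher
```

In practice `predict` would wrap a model call, for example the probability the model assigns to the original target token.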

---

## Installation

Install dependencies:
```bash
pip install torch transformers datasets scikit-learn scipy
```

---

## Usage

### 1. Load datasets
```python
from HETA_dataset import load_all
datasets = load_all()
print(datasets["wikibio"])
```

### 2. Load models
```python
from HETA_models import load_gpt2
model, tokenizer = load_gpt2()
```

### 3. Compute attributions
```python
from HETA_method import heta_attribution
tokens, scores = heta_attribution(model, tokenizer, "The dog chased the cat")
print(list(zip(tokens, scores)))
```
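
To inspect only the most influential tokens, the pairs can be sorted by score. The helper below is a small sketch that assumes `scores` is a sequence of numeric values aligned with `tokens`, as the `zip` call above suggests; `top_k_tokens` is a hypothetical name and not part of the repository, and the scores in the example are made up for illustration rather than real HETA output.

```python
from typing import List, Sequence, Tuple


def top_k_tokens(tokens: Sequence[str], scores: Sequence, k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (token, score) pairs with the highest attribution scores."""
    # float() also accepts NumPy scalars and zero-dimensional tensors.
    pairs = [(tok, float(score)) for tok, score in zip(tokens, scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


# Illustrative values only; real scores come from heta_attribution().
example_tokens = ["The", "dog", "chased", "the", "cat"]
example_scores = [0.05, 0.40, 0.30, 0.05, 0.20]
print(top_k_tokens(example_tokens, example_scores, k=2))
```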

### 4. Evaluate attributions
```python
from HETA_evaluation import sensitivity, f1_alignment
print(sensitivity([[0.1, 0.2, 0.3], [0.2, 0.3, 0.4]]))
print(f1_alignment([1, 2, 3], [2, 3, 4]))
```
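
As a point of reference for the alignment metric, the sketch below computes a set-overlap F1 between two lists of token indices, for example the indices a human annotator marked as important and the indices the model's attributions rank highest. This interpretation is an assumption: `f1_overlap` is a hypothetical helper, and the repository's `f1_alignment()` may define precision and recall differently.

```python
from typing import Iterable


def f1_overlap(predicted: Iterable[int], reference: Iterable[int]) -> float:
    """Set-overlap F1 between predicted and reference token indices."""
    pred, ref = set(predicted), set(reference)
    true_positives = len(pred & ref)
    if true_positives == 0:
        # Covers empty inputs and disjoint sets.
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(ref)
    return 2 * precision * recall / (precision + recall)


# Two of three indices overlap, so precision = recall = F1 = 2/3.
print(round(f1_overlap([1, 2, 3], [2, 3, 4]), 3))  # 0.667
```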

---

## Notes
- For large models (e.g., GPT-J 6B), pass `device="cuda"` to `heta_attribution` for faster computation; the default is `device="cpu"`.
- Hessian computation is expensive; consider using subsets or approximations for long inputs.
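
One widely used way to reduce the cost of second-order terms is to compute Hessian-vector products with two backward passes instead of materializing the full Hessian. The sketch below illustrates that general technique in PyTorch on a toy function. It is not the approximation used in `HETA_method.py`, and `hessian_vector_product` is a hypothetical helper name.

```python
import torch


def hessian_vector_product(f, x: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Compute H(x) @ v for a scalar-valued f without building the Hessian H."""
    x = x.detach().clone().requires_grad_(True)
    # First backward pass: gradient of f, kept in the graph for differentiation.
    (grad,) = torch.autograd.grad(f(x), x, create_graph=True)
    # Second backward pass: differentiate (grad . v) with respect to x.
    (hvp,) = torch.autograd.grad(grad, x, grad_outputs=v)
    return hvp


# Toy check: f(x) = sum(x^3) has Hessian diag(6x), so H @ v = 6 * x * v.
x0 = torch.tensor([1.0, 2.0, 3.0])
v0 = torch.tensor([1.0, 0.0, -1.0])
result = hessian_vector_product(lambda t: (t ** 3).sum(), x0, v0)
print(result)  # tensor([  6.,   0., -18.])
```

Each product costs roughly two gradient evaluations, so memory grows with the input size rather than with its square.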

---

## Authors
This repository was built for reproducing the **HETA** framework.
